PS3 GPU Full VRAM/IO access exploit
Author: Alexandro Sanchez. Date: 2016-03-16
Introduction
During the early development of the PlayStation 3 emulator project Nucleus, it was decided to do a high-level emulation of the PlayStation 3 kernel known as CellOS Lv-2, often shortened to LV2. This implied reverse engineering and reimplementing the kernel, and intercepting the syscalls used by the user-mode applications. The correct reimplementation of a certain group of syscalls, the kernel-level RSX driver interface with prefix `sys_rsx`, was crucial to the success of the GPU emulation. Additionally, these syscalls are a thin wrapper around the actual hypervisor-level RSX driver, accessible through the `lv1_gpu` syscalls.
Between February 2016 and March 2016, the developer @3141card reverse engineered the RSX driver code found in both layers. These sources, combined with the documentation and headers from the Envytools/Nouveau projects and advice from @mwk, eased the security analysis, resulting in the vulnerability presented here.
Reality Synthesizer
The Reality Synthesizer, commonly shortened to RSX, is the PlayStation 3 GPU and is composed of multiple engines. Gross over-simplifications take place throughout this section for the sake of readability. RSX exposes 3 Base Address Registers (BARs):
BAR | Offset | Size | Description |
---|---|---|---|
BAR0 | 0x28000000000 | 32 MB | MMIO |
BAR1 | 0x28080000000 | 256 MB | VRAM |
BAR2 | 0x28002000000 | ??? | RAMIN |
While BAR0 points to the MMIO register area, both BAR1 and BAR2 map to the same 256 MB DDR memory. The difference is that BAR2 offsets are reversed, starting from the end of the VRAM and going to the beginning in chunks of 512 KB. The following functions can be used to convert a BAR1 offset into a BAR2 offset and vice versa:
```c
#include <stdint.h>

/* Convert a VRAM (BAR1) offset into a RAMIN (BAR2) offset. */
uint32_t addr_vram_to_pramin(uint32_t offset) {
    uint32_t vram_size = 0x10000000; // 256 MB
    uint32_t rev_size = 0x80000;     // 512 KB
    return (offset - vram_size) ^ -rev_size;
}

/* Convert a RAMIN (BAR2) offset into a VRAM (BAR1) offset. */
uint32_t addr_ramin_to_vram(uint32_t offset) {
    uint32_t vram_size = 0x10000000; // 256 MB
    uint32_t rev_size = 0x80000;     // 512 KB
    return vram_size - (offset - (offset % rev_size)) - rev_size + (offset % rev_size);
}
```
The driver fills RAMIN with objects, commonly known as FIFO objects, which can be either Engine objects or DMA objects. The former describe engines that perform a particular task (e.g. 2D graphics, 3D graphics, memory copying, etc.); the latter describe a DMA-accessible location.
Certain methods require a DMA object in order to know which data to access. Rather than directly passing the RAMIN offset to the engine, the driver populates a hash table known as RAMHT, which maps a unique handle to the RAMIN offset where the target DMA object is located.
The DMA objects contain information about the access type, the range size and starting offset. Taking into account the IO segments mapped by LV1, a DMA object can reference the following offsets:
Offset | Description |
---|---|
0x00000000 - 0x0FFFFFFF | VRAM |
0x80000000 - 0x8FFFFFFF | IOMMU (Context 0) |
0x90000000 - 0x9FFFFFFF | IOMMU (Context 1) |
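As a minimal sketch of the table above, the following helpers classify the offset a DMA object points at. The names `dma_target_t`, `dma_classify` and `dma_range_is_mapped` are hypothetical and do not come from the LV1 driver; the only facts used are the three 256 MB ranges listed in the table.

```c
#include <stdint.h>

/* Ranges a DMA object may reference, as listed in the table above. */
typedef enum {
    DMA_TARGET_VRAM,       /* 0x00000000 - 0x0FFFFFFF */
    DMA_TARGET_IOMMU_CTX0, /* 0x80000000 - 0x8FFFFFFF */
    DMA_TARGET_IOMMU_CTX1, /* 0x90000000 - 0x9FFFFFFF */
    DMA_TARGET_OTHER       /* anything not listed in the table */
} dma_target_t;

/* Hypothetical helper: each listed range is 256 MB wide, so the top
 * nibble of the offset is enough to tell them apart. */
dma_target_t dma_classify(uint32_t offset) {
    switch (offset >> 28) {
    case 0x0: return DMA_TARGET_VRAM;
    case 0x8: return DMA_TARGET_IOMMU_CTX0;
    case 0x9: return DMA_TARGET_IOMMU_CTX1;
    default:  return DMA_TARGET_OTHER;
    }
}

/* Returns 1 if [start, start + size) lies entirely inside one listed range. */
int dma_range_is_mapped(uint32_t start, uint32_t size) {
    if (size == 0)
        return 0;
    uint32_t last = start + (size - 1);
    if (last < start) /* the range wrapped around 32 bits */
        return 0;
    dma_target_t target = dma_classify(start);
    return target != DMA_TARGET_OTHER && dma_classify(last) == target;
}
```

A range that straddles two contexts, such as one starting at `0x8FF00000` with a size of 2 MB, is rejected because its first and last bytes fall into different rows of the table.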
Exploit
RSX MMIO register mapping
The LV2 kernel provides the following syscall:
```c
// LV2 SysCall 675 (0x2A3)
uint64_t sys_rsx_device_map(uint64_t mmio_addr, uint64_t vram_addr, uint64_t device_id);
```
The table below lists the RSX devices that can be mapped through this syscall. The highlighted entries correspond to the devices involved in the vulnerability:
Device | MMIO | VRAM | Description | Control |
---|---|---|---|---|
5 | 0x08A000 | - | | No |
6 | 0x200000 | - | PMEDIA | No |
7 | 0x600000 | - | PCRTC | No |
8 | - | 0x0FF10000 | | No |
9 | 0x400000 | - | PGRAPH | Yes |
10 | 0x100000 | - | PFB | Yes |
11 | 0x00A000 | - | PCOUNTER | Yes |
12 | 0x680000 | - | | Yes |
13 | 0x090000 | - | | Yes |
14 | 0x002000 | - | PFIFO | Yes |
15 | 0x088000 | - | IOIF | Yes |
By mapping device 14, we can access the PFIFO MMIO registers from userland code (or from LV2, if `ss.param.fself.control` prevents userland from doing that and the EEPROM cannot be patched). Among the many PFIFO registers listed in the Nouveau headers and documents, some stood out as particularly dangerous if misused. These registers are described below:
- `0x002140` NV03_PFIFO_INTR_EN_0: Disable the interrupts that trigger LV1 panics.
- `0x002210` NV03_PFIFO_RAMHT: Controls the size and RAMIN offset of RAMHT.
- `0x002218` NV03_PFIFO_RAMRO: Controls the size and RAMIN offset of RAMRO.
- `0x002504` NV04_PFIFO_MODE: Alternate between PIO and DMA mode in channels.
These register fields are described in detail in nv1_pfifo.xml. CellOS-LV1 sets RAMHT at RAMIN offset `0x10000` with 16 KB in size, and RAMRO at RAMIN offset `0x18000` with 512 bytes in size.
RAMHT manipulation attempt
Our best chance to create custom DMA objects is to create a RAMHT entry pointing to an accessible VRAM area. The first attempt to do so would be moving RAMHT so that other byte sequences are reinterpreted as valid entries. According to the register description above, RAMHT can only be relocated in the range 0x0 to 0x1F000 with an alignment of 4 KB. In order to get a valid RAMHT entry pointing to our VRAM area, we need to find an 8-byte sequence satisfying:
- Bits 31:23 (MSB:LSB) of the second word are equal to 1 (i.e. our application's PFIFO channel).
- Bits 19:0 (MSB:LSB) of the second word hold a value in the range `[0x20000-0xFFFFF]` (mappable VRAM).
- The RAMHT offset minus the entry offset is a multiple of 4 KB.
These conditions are hard to satisfy and, aside from unlikely random values that might have been written during memtest, no such sequence will be found in this range.
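The three conditions above can be sketched as two predicates. This is only an illustration: the function names are hypothetical, and the third check encodes one reading of the last condition, namely that the location of the candidate sequence minus the offset the entry would have inside the table must be a legal 4 KB-aligned RAMHT base in the range 0x0 to 0x1F000.

```c
#include <stdbool.h>
#include <stdint.h>

/* Conditions 1 and 2: inspect the second word of a candidate 8-byte sequence. */
bool ramht_context_is_usable(uint32_t second_word, uint32_t channel) {
    uint32_t chid = (second_word >> 23) & 0x1FFu; /* bits 31:23 */
    uint32_t inst = second_word & 0xFFFFFu;       /* bits 19:0  */
    return chid == channel && inst >= 0x20000u && inst <= 0xFFFFFu;
}

/* Condition 3 (assumed interpretation): the implied table base must be a
 * legal RAMHT location, i.e. 4 KB aligned and no larger than 0x1F000.
 * `sequence_addr` is the RAMIN offset of the candidate sequence and
 * `entry_offset` the offset the entry must have inside RAMHT. */
bool ramht_base_is_valid(uint32_t sequence_addr, uint32_t entry_offset) {
    if (sequence_addr < entry_offset)
        return false;
    uint32_t base = sequence_addr - entry_offset;
    return (base % 0x1000u) == 0 && base <= 0x1F000u;
}
```

Both predicates have to hold for the same 8-byte sequence, which is why a match among the existing RAMIN contents is so unlikely.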
RAMRO as RAMHT entry generator
However, there is still a way to get such entries into RAMHT. RAMRO can only be relocated in the range 0x0 to 0x1FE00 with an alignment of 512 bytes. Submitting invalid PFIFO commands causes 8-byte writes to RAMRO, in which the first word holds the error report and the second word holds the submitted argument. We can control the argument and predict the error report, and are thus able to generate valid RAMHT entries. In order to preserve the integrity of RAMHT, we should ensure that no existing entry is overwritten:
- Invalid PFIFO methods that trigger RAMRO writes in PIO mode are: { 0x0040, 0x0044, 0x0048, 0x0054 }.
- Their corresponding RAMRO error reports are { 0x50401040, 0x50401044, 0x50401048, 0x50401054 }.
- Their corresponding RAMHT offsets for channel 1 are: { 0x0C18, 0x0C38, 0x0C58, 0x0CB8 }.
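The RAMHT offsets in the list above can be reproduced with the following sketch. It assumes the XOR-folding hash that the Nouveau project uses for NV04-family RAMHT, with 8-byte entries and an 11-bit index (16 KB / 8 bytes = 2048 entries). The function name `ramht_entry_offset` is hypothetical, and nothing here is taken from the LV1 driver itself.

```c
#include <stdint.h>

/* Byte offset inside RAMHT at which a (handle, channel) pair is stored.
 * Assumption: NV04-style hash as implemented in Nouveau. The handle is
 * folded into `bits`-wide chunks with XOR, then the channel ID is mixed
 * into the upper bits of the index. Entries are 8 bytes each. */
uint32_t ramht_entry_offset(uint32_t handle, uint32_t channel, unsigned bits) {
    uint32_t mask = (1u << bits) - 1u;
    uint32_t hash = 0;

    while (handle != 0) {
        hash ^= handle & mask;
        handle >>= bits;
    }
    hash ^= channel << (bits - 4u);

    return hash * 8u;
}
```

Treating the four error reports as handles on channel 1 with `bits = 11` yields the four offsets listed above, which is what makes the RAMRO writes usable as RAMHT entries.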
After computing the RAMHT offsets for all pairs consisting of any handle ever created by the LV1 driver and any possible channel ID (up to the maximum of 4 that LV1 supports), we know that the driver will never place a handle in the RAMHT range `0xC00` - `0xCFF` (note that `0xC00` is 512-byte aligned). Therefore, RAMRO can be moved inside RAMHT without fear of a collision.
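As a sanity check on this placement, the sketch below verifies that a relocated RAMRO base is legal and that a given RAMHT entry falls inside the 512-byte RAMRO window on an 8-byte boundary. It assumes RAMRO is moved to RAMIN offset `0x10C00`, i.e. the RAMHT base `0x10000` plus `0xC00`; the text does not spell out this exact value, and the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define RAMHT_BASE  0x10000u /* RAMIN offset of RAMHT as set by CellOS-LV1 */
#define RAMRO_SIZE  0x200u   /* 512 bytes */
#define RAMRO_ALIGN 0x200u   /* RAMRO can only be placed on 512-byte boundaries */
#define RAMRO_MAX   0x1FE00u /* highest RAMIN offset RAMRO can be moved to */
#define ENTRY_SIZE  8u       /* RAMRO records and RAMHT entries are both 8 bytes */

/* True if RAMRO may be relocated to `base`. */
bool ramro_base_is_valid(uint32_t base) {
    return base <= RAMRO_MAX && (base % RAMRO_ALIGN) == 0;
}

/* True if the RAMHT entry at `entry_offset` (relative to RAMHT_BASE) lies
 * inside the RAMRO window starting at `ramro_base`, aligned to a record. */
bool ramro_covers_entry(uint32_t ramro_base, uint32_t entry_offset) {
    uint32_t addr = RAMHT_BASE + entry_offset;
    if (addr < ramro_base || addr + ENTRY_SIZE > ramro_base + RAMRO_SIZE)
        return false;
    return ((addr - ramro_base) % ENTRY_SIZE) == 0;
}
```

With `ramro_base = 0x10C00`, the entry offsets `0x0C18`, `0x0C38`, `0x0C58` and `0x0CB8` all pass both checks.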
Accessing custom DMA objects
The VRAM reserved for `vsh.self` (VirtualShell/XMB), i.e. channel 0, is allocated from the front, and the remaining VRAM, aside from the first 2 MB of RAMIN, is assigned to the application, i.e. channel 1, by the GCM library. Therefore, any RAMIN offset larger than 2 MB assigned to channel 1 will lie in an accessible VRAM area. E.g.:
```c
0x00880000 == (1 /*Channel ID*/ << 23) | (0x800000 /*RAMIN offset at 8 MB*/ >> 4)
```
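The same encoding can be written as a small helper that also rejects offsets outside the accessible area. This is a sketch: the name `ramht_make_context` is hypothetical, the 16-byte granularity follows from the `>> 4` in the expression above, and the limits come from the `[0x20000-0xFFFFF]` range given earlier.

```c
#include <stdbool.h>
#include <stdint.h>

/* Build the second word of a RAMHT entry from a channel ID and the RAMIN
 * offset of the DMA object. Returns false without touching *out if the
 * inputs cannot be encoded or fall outside the accessible area. */
bool ramht_make_context(uint32_t channel, uint32_t ramin_offset, uint32_t *out) {
    if (channel > 0x1FFu)              /* bits 31:23 hold 9 bits */
        return false;
    if ((ramin_offset & 0xFu) != 0)    /* offset is stored shifted right by 4 */
        return false;

    uint32_t inst = ramin_offset >> 4;
    if (inst < 0x20000u || inst > 0xFFFFFu) /* below 2 MB, or too large for bits 19:0 */
        return false;

    *out = (channel << 23) | inst;
    return true;
}
```

Calling it with channel 1 and a RAMIN offset of 8 MB (`0x800000`) produces the second word that the RAMRO write has to carry as its argument.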
The only remaining step is placing our custom DMA object at that offset. Finally, a combination of the PFIFO puller methods can be used to trigger a write in our custom DMA range:
- `0x0060` NV406E_SET_CONTEXT_DMA_SEMAPHORE: Set DMA object handle (i.e. the `0x504010XX` reports above).
- `0x0064` NV406E_SEMAPHORE_OFFSET: Set the offset we want to write in.
- `0x006C` NV406E_SEMAPHORE_RELEASE: Write the specified value there.
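The order of these three methods can be sketched as a small builder that only records (method, data) pairs. How the pairs reach the puller (PIO writes or a DMA push buffer) is deliberately left out, since it is not described here; the `fifo_method_t` type and `build_semaphore_write` function are hypothetical, while the method numbers are the ones listed above.

```c
#include <stddef.h>
#include <stdint.h>

#define NV406E_SET_CONTEXT_DMA_SEMAPHORE 0x0060u
#define NV406E_SEMAPHORE_OFFSET          0x0064u
#define NV406E_SEMAPHORE_RELEASE         0x006Cu

/* One PFIFO method invocation: method number and its 32-bit argument. */
typedef struct {
    uint32_t method;
    uint32_t data;
} fifo_method_t;

/* Fill `out` with the three methods that write `value` at `offset` inside
 * the DMA object identified by `dma_handle`. Returns the number of entries
 * written, or 0 if `capacity` is too small. */
size_t build_semaphore_write(fifo_method_t *out, size_t capacity,
                             uint32_t dma_handle, uint32_t offset,
                             uint32_t value) {
    if (out == NULL || capacity < 3)
        return 0;

    out[0] = (fifo_method_t){ NV406E_SET_CONTEXT_DMA_SEMAPHORE, dma_handle };
    out[1] = (fifo_method_t){ NV406E_SEMAPHORE_OFFSET, offset };
    out[2] = (fifo_method_t){ NV406E_SEMAPHORE_RELEASE, value };
    return 3;
}
```

Here `dma_handle` would be one of the error reports used as a handle, for example `0x50401040`, and `offset` is relative to the start of the range described by the custom DMA object.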
If the specified value ends up at said offset in the range specified by our DMA object, the exploit has succeeded.